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Abstract: We investigate the performance of a generalized linear mixed model in predicting the 
Probabilities of Collision (Pc) for conjunction events. Specifically, we apply this model to the logjo 
transformation of these probabilities and argue that this transformation yields values that can 
be considered bounded in practice. Additionally, this bounded random variable, after scaling, is 
zero-inflated. Consequently, we model these values using the zero-inflated Beta distribution, and 
utilize the Bayesian paradigm and the mixed model framework to borrow information from past and 
current events. This provides a natural way to model the data and provides a basis for answering 
questions of interest, such as what is the likelihood of observing a probability of collision equal to 
the effective value of zero on a subsequent observation. 
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1. Introduction 

The problem of deeiding whether to maneuver a satellite that is in eonjunetion with another spaee 
objeet is often not straightforward, and a serious threat involves the deliberation and eooperation 
of various parties[l]. Quantifying the risk for any sueh eonjunetion is generally aeeomplished 
through the use of the predieted miss distanee at time of elosest approaeh (TCA) and the ealeulated 
probability of eollision (Pc) at that same time. These measurements are generally taken beginning 
up to a week before TCA and eontinue until the opportunity for any aetive remediation of risk 
has passed. The ealeulated Pc values are affeeted by the uneertainty in the positions of the spaee 
objeets, an uneertainty that generally deereases as one approaehes TCA. This deerease in uneertainty 
eventually yields a deerease in Pc for most events, although the rate and manner of deerease is 
different for eaeh. 

In a reeent paper, we investigated the use of a simple method for determining whether any given 
event’s eurrent Pc value is likely to be the peak Pc value for the entire event. [2] We found that this 
method was relatively sueeessful and thus that some trend analysis in Pc value is possible. Though 
the model was eompetent at peak predietion, it did not have the sophistieation to eapture the overall 
behavior of the Pc values in all eases; and that one of the main eauses of its diffieulties was the large 
number of Pc values equal to essentially zero. In this paper, we propose a more sophistieated model 
designed to eapture more sueh nuanees of the data and whieh we ultimately test for predietion of Pc 
values. We also propose a simple method, whieh we name the “Look-Up Method,” as a baseline 
against whieh to evaluate model performanee. The Look-Up Method is intended to represent how 
an analyst might make intuitive judgments regarding the progression of Pc values over time. 
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2. Methods for Trending in Probability of Collision 


2.1. Calculating the Probability of Collision 

The basic procedure for calculating the probability of collision between two space objects is to obtain 
positions and position covariances propagated to TCA, and, using this information, to determine the 
probability of the two objects’ passing within a stated small distance of each other. One generally 
makes assumptions of Gaussian uncertainty distributions, rectilinear motion in the neighborhood 
of the conjunction, and uncorrelated covariance matrices in order to reduce the problem to a more 
manageable two-dimensional formulation. [3] Alfano[4] and others have noted that this probability 
measure tends to follow a canonical progression of measured increase and then untimately rapid 
decrease in Pc as uncertainty in the satellite’s positional data decreases, a trend that was examined 
in our previous paper. 

We seek to model the path of the Pc values for any event as a set of observations for a subject in a 
mixed model. In contrast to the previous paper in which our main focus was peak prediction, here 
our ultimate goal is to predict the value of the next observed Pc measurement, although it is hoped 
that a robust peak prediction capability may flow from this. To verify the accuracy of our model, 
we test it against both a large archive of past conjunction information and the “Look Up” approach 
mentioned previously. 

2.2. Data 

When modeling the trend in Pc values, one is generally concerned with changes in order of mag- 
nitude; thus one generally models logipPc as opposed to the observed Pc values. This poses an 
interesting statistical question, namely the distribution of logjQ^c values. Distribution selection 
is more obvious for the Pc values, as they are bounded between 0 and 1; thus a statistical mod- 
eler generally would choose a beta distribution to model these values (although there are a few 
other less commonly used distributions, such as the simplex distribution, that could be deployed). 
Theoretically, there is no lower bound on logjQ/’c values, as Pc values can be arbitrarily close to 
zero. Operationally, however, one often considers Pc values below lE-10 to be effectively 0. To 
account for this effective terminus in P^ in our previous efforts we “floored” the logjQ/’c values 
at -10, so that the large number of small logjo^c values did not overly influence the model. This 
simplification allows one to focus inference on the operationally relevant logjo^c values, which 
tend to be around -5 and greater. We follow suit here, flooring all logjo^c values at -10. Therefore, 
even in modeling the logiQ^c values, we have bounded data (between -10 and 0) that are “inflated,” 
meaning that there are a notable number of points on the boundary of the defined interval-points 
that may be just beyond what is representable by a hypothesized distribution over that interval. In 
this case, the data are -lO-inflated, but when the variable is rescaled to fit the Beta distribution, the 
data are zero-inflated. Such a situation can be accommodated by the zero-inflated beta distribution; 
the form of the equation and basic definitions are given below, with more elaborate discussion of 
the parameters and their meaning provided in subsequent sections: 
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Here, Ia{-) is the indicator function, so that the first term corresponds to values of y falling between 
0 and 1 (or log Pc values falling between -10 and 0) and the second term corresponds to values equal 
to 0 (or log Pc values equal to -10). The parameter /i is the mean of the Beta distribution, which 
will be modeled in the Generalized Linear Model (GLM) framework, and the parameter (j) is the 
corresponding dispersion parameter, which is a measure of variability. The parameter p can be 
interpreted to be the probability that one observes a 0 (that is, a logPc of -10). 

Additionally, as noted previously, the Pc values for each event tend ultimately to decrease with time 
but at a different rate in each conjunction. This suggests approaching the problem within a mixed 
model framework, allowing random terms for each conjunction. This is a natural approach to take, 
as the data are longitudinal in nature: one observes an overall trend in time; yet each subject (in this 
case, each conjunction) deviates somewhat from this trend, and observations within a subject are 
correlated with each other. In Figure 1 , we visualize the longitudinal nature of the data. We plot the 
logjgPc values of ten events over time, with each events’ values connected by a line; and we also 
plot these values versus the ratio of combined covariance radius to miss distance. This plot exposes 
the canonical trend in Pc development: as the event moves closer to TCA, the covariance shrinks, 
bringing this ratio slowly to a peak and then a marked drop-off. 

It is clear that the trend is more pronounced for the ratio of covariance radius to miss distance. 
Unfortunately, as was discussed in our previous paper, this value is not monotonically increasing or 
decreasing with time (due to unpredictable changes in the covariance size and the estimate of the 
mean miss distance between the two satellites as the event develops); so despite its closer linkage 
to the root phenomenology of the situation, it is actually a less desirable independent variable for 
performing trending and prediction. Therefore, as previously, model construction for P^ trending 
and prediction will use time to TCA as the independent variable. 

We can visualize the trend of the log jq Pc values over time by considering a two-dimensional 
histogram, shown in Figure 2. Recall that we have replaced all logipPc values below -10 with -10. 
This figure indicates that the probability of observing a Pc value of IE- 10 or lower increases as one 
approaches TCA. In fact, at 2 days to TCA, about 40% of events observed have a Pc of lE-10 or 
lower. At 7 days until TCA, the most observed value is about -5, which becomes less frequent over 
time, as more events observe a log^pPc of -10. Interestingly, -5 seems to be the most likely value 
when one does not observe a -10, regardless of the time. We can use this information to construct 
prior information for the model in Equation (1), as the increase of observed -10 values gives us an 
idea of how p behaves over time, and the observed mode of -5 of the log^pPc values above -10 gives 
us some information about the mean of the Beta distribution. 

3. The Bayesian Beta Regression Model 

To model a beta-distribution random variable with reference to a covariate (such as time), it is 
best to use a GLM. Although GLM’s for all other members of the exponential family (Normal, 
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Gamma, etc.) have been developed since 1972[5], the GLM for the beta distribution is relatively 
new, being introduced in 2001 [6]. The reason for this late development is due to the fact that one 
must reparametrize the beta distribution in order to model the mean adequately, an expansion that 
was not explored until recently. We provide the derivation here for completeness. The probability 
density function (pdf) of a random variable X with a beta distribution is generally given as 


/(x|a,^) 


r{a + j5) 
na)r(P) 


(1 


-x)^ \ 


where r(-) is the gamma function. The mean of this distriubtion is E{X) = GLM’s are 
generally specified by setting some function g{ji) of the mean equal to a linear combination of 
covariates. For instance, logistic regression uses the logit link g(/i) = log(j^), which is then set 
equal to a linear combination of covariates, e.g. /3 q + PiX, where X is a covariate, such as time. 
However, as the beta distribution is specified, it is unclear how to model the mean. To facilitate 
direct modeling of the mean, let /i = and ^ = a + /3 . Then we can rewrite the beta pdf as 




r(At<^))r((i-At)<^)) 






Now we may model the mean /i directly. For instance, we may choose the logit link and model 


log ^ ^ ^ ^ ^ — Po + l^lXij + ---+lipXfj, 

so that the log-odds of the mean has a linear relationship to X. Various link functions are possible, 
such as the probit link, the complementary log-log link, and the log link. Our simulations have 
shown that there is no significant advantage in choosing one over the other, so we proceed with the 
logit link, as it is comparatively easy to interpret. 


Recall that for each conjunction, one observes a different progression of Pc values. Sometimes the 
Pc values drop off quickly well prior to TCA, other times they drop off much nearer TCA, and 
sometimes not at all. To model such a behavior, we include a random intercept for each event as 
follows. Let Hij be the mean of the Pc value in the event, scaled to be between 0 and 1. Since 
we have logipFc values bounded between -10 andO, a suitable transformation is = E{Yij)/lO+ 1, 
where Yij is the logjQFc value of the /* Pc value in the event. We may consider the model 

= ^0 + l^iUj + ••• + l^ptfj + bi, 

where bi is the random intercept for the event, and tij is the time until TCA of the /* Pc value 
within the same event. One may additionally consider a random slope or other random effects for 
higher order terms. 


Recall that in Equation (1) we also introduced the parameter p. This parameter controls what 
percentage of the time we observe a zero. In our case, since about a third of our data are zeros, 
p might be close to 1/3. However, we also know that the closer an event approaches TCA, the 
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more likely one is to observe a Pc value that is 0. As a result, we ean also let p depend on our 
eovariate. Sinee it is a probability, this parameter is also bounded between 0 and 1, so we ean also 
use a logit link funetion here (or any of the other aforementioned link funetions). Additionally, we 
may eonsider a random term for this model for eaeh event, as the probability of observing a zero is 
higher for some events than others. Thus, we may eonsider a regression sueh as 

log ^ = OtO + CHhj + ••• + ^p^fj + 

whieh is similar to the regression for /i above. Again, if we wish we ean eonsider other random 
terms, i.e., a random slope. 

3.1. Model Selection 

Given below are some seleeted results from an exploratory model seleetion. To evaluate the relative 
merits of different levels of model eomplexity, we use the penalized devianee eonstruet[7], where 
lower values indieate a better fit. Speeifieally, D{9) is defined as the “Bayesian Devianee,” with 
form 


D(0) = -21ogp(y|0) + 21og/(y), (2) 

where p{y\9) is the likelihood of y given 0 and f{y) is the saturated model, where f{y) = 
p{y\E{Y) =y}. We ean rewrite Z)(0) as 

D(0) = -2(logp(y|0)-log/(y)), (3) 

whieh shows that D(0) is -2 times the differenee between the fitted model and the saturated model. 
Put simply, D{6) measures how well a model has fit the data relative to a model that fits the data 
perfeetly. We estimate D{6) with D{6), whieh ean be written as 

^ = D(0)+po, (4) 

where pn = D{9) —D{9). The estimate D(0) is known as the penalized deviance, as it is eomputed 
as the sum of D{9), the mean devianee, and po, the penalty term. The term D{9) measures how 
well a model fits the data, with lower values indieating better fit, and the term po penalizes this fit 
for more parameters, where higher values indieates a larger penalty. The penalty term po is also 
known as the effective number of parameters, so that one may interpret this term as an estimate 
of how many parameters the model is aetually estimating in order to deseribe the data. This is to 
aeeount for the expeeted outeome of models with more parameters fitting the data better and thus 
potentially over-fitting the data. 

Given in Table 1 is the ealeulated mean devianee, penalty, and eonsequent penalized devianee for 
various models. This is provided in order to justify the seleetion of our final model, as we ehose 
the model with the lowest penalized devianee. The variables Yc and Y^ represent the eontinuous 
and diserete parts of the model given in Equation (1), respeetively. That is, Yc are the values 
produeed by the beta distribution, and Yd are the 0-1 variables that either indieate a zero (1) or a 
eontinuous variable (0). All added eomplexities are in addition to the baseline linear model speeified 
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in equations (5-12) below. Let Yij be the scaled logjo^c value of the event. Also, let tij be the 
corresponding time until TCA (in days). 


^ij ^ f (yt j I j (5) 

= po + pitij + bi ( 6 ) 

bi^N{Q,Xb) ( 7 ) 

T^, ~ Gamma(0. 001, 0.001) (8) 

log = «o + aitij + Ui (9) 

a,-~A(0,Ta) (10) 

Tfl ~ Gamma(0. 001, 0.001), (11) 

Pki^Ck ^ Normal{0,l), A: = 0,1,2. (12) 


Table 1 shows that adding a random slope to either 7^ or did not produce a better fit, nor did 
specifying a correlation between the random effects: 


Model 

Mean Deviance 

Penalty 

Pen. Deviance 

Linear 

-17.23 

74.93 

57.7 

Quad Term for Yc 

-23.02 

77.32 

54.31 

Quad Term for Y^ 

-26.78 

76.45 

49.67 

Quad Term for Yc and Yj 

-32.52 

79.23 

46.71 

QuadTerm for both, RanSlope for Yc 

-31.12 

81.07 

49.95 

QuadTerm for both, RanSlope for Y^ 

-36.91 

85.54 

47.63 

Cubic Term for Yc 

-31.89 

81.1 

49.21 

Quadratic, linear for phi 

-27.76 

80.87 

53.11 


Table 1. Model Selection Output 


Based on these results, we propose the following final form of the model: 
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YiJ 

^ f(yij\Pij,^ij,p) 

(13) 

log 

Pij ^ 
1 ~ Pij ) 

= Po + PPij + Pitfj + bi 

(14) 


bi 

^N{0,Xb) 

(15) 


Tb 

~ Gamma (0.00 1,0.001) 

(16) 

log(^ 

Pij \ 

= OJq + 0^1 0 j + 

(17) 


Ui 


(18) 


Tfl 

~ Gamma(0. 001, 0.001), 

(19) 



Normal(0,l) k = 0,1,2. 

(20) 


3.2. Inference 

3.2.1. Inference for a new data point 

Let 0 = (oq, cti, a2,j8o,j8i,j82 ,^,Tfl,Tfc) . Suppose one has data y and one wishes to prediet y* = 
log Pc at a new data point at time t*. One ean make an inferenee on y* by using the predietive 
distribution 

^(/k*,y)= [ g{y*\d,x*,y)7t{0\y)de, 

J@ 

whieh ean be estimated by using the posterior samples of a Markov Chain Monte Carlo (MCMC) 
simulation, t* ean be a future time for whieh estimating the Pc would be desirable, or it ean be set 
to the time of the next reeeived Pc value in order to evaluate the model’s predietive power through 
residual analysis. In eondueting evaluations within the latter paradigm, we eonstruet a 95% eredible 
interval for y* and eheek to see if the aetual value of y is eontained in the interval. The pereentage of 
eredible intervals whieh eontain the true y value is known as eoverage. If the eoverage is elose to the 
nominal value of 95%, we ean assume that these predietions are reliable. These predietions are made 
starting with the seeond Pc observation for eaeh event, as was done in our previously-refereneed 
efforts. 

3.2.2. Issues of Identifiability 

The model proposed in equations (13)-(20) has a total of 7 parameters and 2 random effeets, whieh 
suggests one must estimate a total of 9 quantities in order to make inferenees and henee predietions. 
However, this issue ean be ameliorated by using informative priors in a Bayesian framework. 
To aequire these informative priors, we run the proposed model on a training dataset of a large 
number of events, whieh are not used for model evaluation. We use the posterior distribution of 
the parameters as informative prior distributions by matehing sample moments of the posterior 
samples with its prior distribution family. We do this for all of the population-level parameters, 
whieh are phi, a^, and j5k for k = 0, 1, 2. Then we are left with two parameters to estimate, the 
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random intercepts a/ and bj. Because we make predictions beginning with the second observation, 
these parameters are identifiable when making inference on a single event. 

Motivated by the large number of events in our testing data set, we investigated whether prediction 
could be improved by making inferences on more than one event at once. In order to test this, 
we followed the mean squared prediction error (MSPE) when making predictions on one event, 
5 events, 10 events, and 25 events. Including more events did not improve the MSPE, likely due 
to the fact that, in reference to a single event, other events contribute only to the population-level 
parameters, which are already well known due to the informative prior distributions. Thus, it is 
quite adequate to make predictions on a single event at a time. 

4. The Look-Up Method 

4.1. Intuition 

As a basis of comparison to our Beta regression model, we propose a simple alternative model, 
which we call the “Look-Up Method.” The Look-Up Method is based on the common expectation 
that, when one observes an event with relatively high Pc values, one can suppose this event to 
behave similarly to other events with other similarly high values. In order to formalize this intuitive 
approach into an explicit model, we need to establish how high “relatively high” is. A natural way 
to quantify this notion is in terms of quantiles. That is, we expect events with log Pc values in the 
<7th quantile to behave similarly to other events with log Pc values in the qth quantile. The method 
we describe below is similar to methods involving “look-up tables,” where one has quantiles for 
various scenarios and can look up the probability of an event within the table. 

4.2. Method 

Let X and y be the time and Pc value from the most recent observation. Lurthermore, let and 
represent the time of prediction and the true Pc value at this time. The algorithm for the Look-Up 
Method is as follows 

Algorithm 

1. Choose an historical data set Y/, such that the events contained in Y are believed to behave 
similarly to the event of interest. 

2. Choose a window w. 

3. Calculate the empirical CDL F{y) of the log Pc values in the interval {x-w,x-\-w). 

4. Calculate the sample quantile qoiy 

5. Calculate the empirical CDL F{y) of the log Pc values in the interval -|- w). 

6. Predict to be 

We find that w = 0.5 days to be a reasonable window length, as it is a serviceable adjudication 
between compressing the time-span enough to contain data for only a single developmental stage 
yet be broad enough to allow a reasonable amount of sampling. This length depends on how much 
prior data is at hand, as w may need to be larger for datasets which are less dense at the time of 
interest. Note that this method only predicts an estimate of y”®’^ and does not by default generate a 
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prediction interval or any other confidence information. 

The method above is simple: find the sample quantile of the observed Pc value at the given time, 
and assume that future Pc values will be at the exact same quantile. While simplicity has its virtues, 
in statistical analysis simple models often discard potentially useful information. For instance, the 
predictions are made based only on the sample quantile of the most recent observation and make no 
use of previous observations within the given event. However, one could argue that the most recent 
observation is the most (or only) meaningful observation, and thus one should make inferences 
based on this value rather than more immediate past values. 

4.3. Prediction Intervals 

As noted above, the Look-Up Method does not automatically generate prediction intervals; this is a 
consequence of the method making no distributional assumptions. However, one may still construct 
prediction intervals via bootstrapping or cross-validation[8]. These methods have been formally 
compared[9], and the results from this investigation indicate that estimators based on Repeated 
Cross Validation (RCV) tend to outperform other estimators (c.g., bootstrap estimators). As a result, 
we implement RCV to generate prediction intervals. The method was initially proposed by Burman 
(1989)[10], which describes the algorithm in detail. 

5. Measures of an Effective Model 

In this section we discuss the manner in which we will compare our two models. We focus on 
model fit and decision-making performance. 

5.1. Model Fit 

The main concern in building predictive models is fitting the data well enough to predict new 
observations accurately. In order to quantify model fit performance, we check the bias, prediction 
errors, and upper bounds of the proposed models. Specifically, we would like our models to be 
unbiased, so that the prediction errors are centered at zero. Secondly, we check to see if the 
prediction intervals are bigger or smaller for different times, predicted values, and prediction 
intervals. Lastly, we check to see that our upper bounds have the correct coverage. 

5.1.1. An Aside: Coverage from an Initial Simulation 

In an initial exploratory simulation, we found that 97.5% prediction intervals constructed in the Beta 
model had 86% coverage. Though coverage with real data is often lower than the nominal rate, this 
coverage level is low enough to bring into question the model’s operational utility. In investigating 
this phenomenon more deeply, we found that dividing the evaluation data into three parts, a high-, 
medium-, and low-risk group, ameliorated the issue of low coverage. Specifically, if an event had 
a high (above -4) logRc value by 3 days’ time to TCA , we called it high-risk. If an event had a 
medium (between -7 and -4) log Re value by 3 days’ time to TCA, we called it medium-risk. If an 
event had a low (below -7) logRc value by 3 days’ time to TCA, we called it low -risk. We shall refer 
to the high-, medium-, and low-risk groups as Red, Yellow, and Green hereafter. 
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Incidentally, the fact that our model performed well when the data were separated into different 
risk groups supports the notion that the log Pc value behaves differently depending on the quantile 
it inhabits. In terms of the Beta regression model, this successfuul stratification of data implies 
that the population-level trend is different for these different risk groups, which counsels that they 
ought to be modeled separately. In future work, we plan to explore more rigorously exactly how 
one should define these different risk groups. For the simulations presented in this paper, these 
definitions worked well and possess the additional advantage of aligning closely with thresholds 
presently in use operationally for categorizing conjunction event severity. 

5.2. Decision-Making Efficacy 

5.2.1. Framework 

In order to assess our models within the framework of making decisions about whether to continue 
active monitoring of a conjunction event, we implement a simple decision-making paradigm 
and study its properties in both models. Because the typical period for conjunction assessment 
operational decision-making occurs 2-3 days; time to TCA, we focus on this region of the data. 
Specifically, we make predictions at 2 days’ time to TCA and take a decision based on this prediction. 
Let be an estimate of the logFc predicted to occur at 2 days’ time to TCA. We will make the 
decision that the logic values will remain above the threshold 6 after 2 days; time to TCA if 

h>0 ( 21 ) 

and will make the decision that the logFc values will fall below the threshold 0 otherwise. To couch 
this in the hypothesis testing framework, we write 

Ho:y2<9 vs. Hi \y 2 > 9, (22) 

so that rejecting Hq is synonymous with claiming the logFc will remain high. In our simulations, 
we set 9 ~ —5 for the Red group and 9 ~ —1 for the Yellow group. Note that while -7 is the lower 
bound for being in the Yellow group at 3 days’ time to TCA, -5 is below -4, the corresponding 
lower bound for the Red group. A lower threshold was chosen as these events are generally of much 
higher concern, thus one prefers an extra order magnitude of certainty before claiming the event is 
at a lower risk level. In order to explore this trade-off fully, we examined various quantiles of the 
distribution of y 2 , which we describe below. 

5.2.2. Type I and Type II Errors 

As with most decision-making frameworks, our framework can admit Type I and Type II errors. 
The hypothesis in (22) is framed in terms of the event of a Pc value remaining high, as this is the 
event we are most concerned with. A Type I error in this case is the incorrect assertion of a high Pc 
value (i.e., a false alarm), and a Type II error is a the more worrisome incorrect prediction of a low 
value (i.e., a missed detection). Thus, while we may find it acceptable to trigger an alarm when the 
logRc value has actually dropped off, it is almost never acceptable to miss detecting a high logRc 
value. Of course, we can make our system as powerful against missed detections as we want, with 
the trade off of triggering more false alarms. It is worth noting that a false alarm for a high value is 
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the same thing as missed deteetion for a value whieh has dropped off. Ideally, we would like to 
have an alarm that deteets high values and low values with a high degree of aeeuraey. However, 
sinee we are more eoneerned with high values, we seek to quantify how often, if ever, ean we deteet 
these low values while still maintaining the high aeeuraey needed for deteeting the high values. 

6. Numerical Results 

6.1. Data 

The datasets previously mentioned for tuning (i.e., setting the parameters for the informative prior 
distributions) and testing the model was taken from the NASA Conjunetion Analysis and Risk 
Assessment historieal Conjunetion Message database. For the Yellow group, five hundred events’ 
worth of data from ealendar year 2013 was used for model tuning (the “training” dataset), and the 
tuned model was evaluated against approximately 2000 events from 2014 (the “validation” dataset); 
so there was no overlap in terms of time-period or aetual data between the two datasets. For the Red 
group, 82 events were used for training and 70 were used for testing (this data set is far smaller, as 
these kinds of events are more rare). Data were taken from eonjunetions against primaries in the 
orbital region defined by a perigee height between 500 and 750 km and an eeeentrieity less than 
0.25. As mentioned perviously, data flooring at a log^oRc value of -10 was performed on the dataset. 
To qualify for use in tuning or evaluation, an event must have had at least two CDMs with a logipRc 
greater than -10. 

6.2. Simulation Setup 

As deseribed earlier, to train our Beta distribution model, we perform a Bayesian analysis on 
the training data using non-informative priors. We determine the distribution parameters for the 
informative priors used in the test data by matehing the first and seeond moments of the observed 
distributions to the hypothesized prior distributions. These informative priors are then used in the 
predietive simulations with the validation dataset. All MCMC inferenee is eondueted using the 
JAGS (“Just Another Gibbs Sampler”)[ll] software suite. 

The simulation proeedure for a given event is as follows. We attempt to make predietions for the 
peak y value only after the seeond reeeived Pc ealeulation. We are interested in estimating the next 
log Re value, whieh we prediet by using the time at whieh it was observed. The predieted value 
is taken to be the mode of the posterior predietive distribution. In this eontext, it is important to 
use the posterior mode as opposed to the posterior mean, as the posterior predietive distribution is 
generally bimodal, with some mass at -10 and the remaining density between -10 and 0, indueing 
another peak. Thus, we ehoose the “most likely value” as opposed to the mean. The predietions 
using the Look-Up Method are performed in the straightforward manner deseribed in seetion 4.2. 

To further assess model fit, we also traek a two-sided 95% eredible interval for eaeh predietion. 
We utilize the upper bound from the eredible set for eheeking eoverage. This is also done for the 
Look-Up Method, though here the interval is a eonfidenee interval and is ealeulated using repeated 
eross-validation. In addition to eoverage, we are also interested in how many of these upper bounds 
are low enough to be “useful”. That is, we would like to know how many of these lower bounds are 
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lower than the lower threshold of the Yellow and Red groups. 

6.3. Results 

Below we discuss model fit for a simulation run on the Red data set. The decision-making efficacy 
is discussed relative to the Yellow data set. 

6.3.1. Coverage 

The upper bounds we constructed for both the Beta and Look-Up models achieved 97.6% and 97.4% 
coverage, respectively. The fact that these both achieve the nominal coverage of 97.5% suggests that 
both models have been specified properly and are reliable in creating prediction intervals. Figure 
3 shows the relationship of these upper bounds with time to TCA. Notice that because the upper 
bounds were found via cross-validation for the Look-Up Method, they do not always conform to 
the distribution of the data and can yield physically-impossible positive values. Also, the Look-Up 
Method method is better at detecting a logF^ of -10, and thus one observes many upper bounds at 
-7.2 (the 97.5% upper bound on errors was 2.8). 

Figure 4 plots the probability densities of these upper bounds on top of each other. Again, we see 
the positive upper bounds from the Look-Up Method, as well as a large density of lower bounds 
between -8 and -6. These figures highlight deficiencies in bofh models: in the Look-Up Method 
some upper bounds are above 0, and in the Beta Regression method not enough upper bounds are in 
the -10 to -7 (Green) region. 

6.3.2. Prediction Errors 

To get a better understanding of the model fit, we can inspect the prediction errors. Figure 5 shows 
the prediction errors plotted against time. In both cases, they are unbiased (have essentially zero 
means) and may increase slightly in variability near TCA. Again, we can see that the Look-Up 
Method is competent at detecting log Pc values, as evidenced by the large number of residuals equal 
to 0. This is the case of a -10 being predicted by a -10. 

We also examine predictive performance as a function of prediction interval. Figure 6 shows the 
residuals against the time until the predicted value. Most of the predictions are made within a 
day, suggesting that most event updates occur in this time frame. There seems to be no real trend, 
except possibly a slight decrease in variability as the time between the current observation and the 
prediction goes to 0, which is to be expected. 

Next, we explore the relationship between the residuals and the actual log Pc value. It is apparent 
from Figure 7 that both models are somewhat biased. This bias is a result of the nature of the data 
and the models: it is impossible to observe a value below -10, and in both models is impossible to 
predict a value below -10. Thus, when one errs in these cases, one always errs high. The residuals 
from the Look-Up Method are somewhat better behaved in this setting, leveling out to a mean of 
zero around an actual log Pc value of about -5 or -4. The Beta regression residuals do not level off in 
this way until an actual log Pc value of about -3 or -2. 
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Figures 8 and 9 show a few more diagnosties for the residuals. It is worth noting that in Figure 9, 
there are a large number of predietion errors that are 0 for the Look-Up Method and that are large 
negative predietion errors in the Beta regression model. This alignment again has to do with the 
Look-Up Method eorreetly identifying the logic values of -10. 

6.3.3. Making Decisions 

Figure 10 is a Reeeiver Operating Charaeteristie (ROC)-like eurve demonstrating the properties of 
the alarm system for deteeting a high value in the Yellow group. Here, our threshold is 0 = — 7. 
Notiee that while the Beta model has a smaller distanee between the eorreet deteetion of a high 
value and a false alarm, the Look-Up Method is less able to deteet high events eorreetly. In praetiee, 
one would like to have an extremely high (say, 90-99%) rate of eorreetly deteeting high events. 
Though the Look-Up Method is a potentially more useful alarm system at lower rates of eorreetly 
deteeting a high value (it triggers fewer false alarms relative to the Beta model), its inability to 
eorreetly deteet high values makes it virtually unusable in praetiee. 

To further visualize this phenomenon, eonsider Figure 11. This is an ROC eurve whieh plots the 
results of Figure 10 in a elassie true positive vs. false alarm setting. Again, here a true positive is a 
eorreet deteetion of a high event. As we are interested in 90-99% eorreet deteetion of high events, 
we eonsider the upper right-hand portion of the graph. Notiee that the Beta model has numerous 
points in this region (with false alarm rates ranging from about 70-100%), while the Look-Up 
Method only has one point: 100% true positives with 100% false alarms. That is, one triggers an 
alarm for every event. The Beta model has many points in this region, and they are all above y = x, 
indieating that the model eorreetly identihes a high value more often than triggering a false alarm. 

It should be noted that in Figure 1 1 that the lower left-hand portion of the graph indieates that, 
beeause of the relative performanee of the true positive and false alarm rate, the Look-Up Method 
is a better alarm system in this region. This improved performanee eorresponds to the 0th to 20th 
pereentile region from Figure 10, whieh have a higher rate of eorreetly deteeting a high value. 
Although we are not interested in this area of the graph as the probability of deteetion is too low to 
be useful, it is worth noting that this again shows that the Look-Up Method has some advantages 
relative to the Beta model, indieating that possibly an even better model eould be eonstrueted. 

Lastly, one may wonder why the Look-Up Method is unable to eorreetly deteet high values at a 
higher rate than shown. This is likely due to the faet that the method is ad-hoe, and thus there is no 
guarantee that the model will produee meaningful predietions or predietion intervals. 

7. Conclusion and Future Work 

In this paper, we have presented two models for predieting future logFc values in eonjunetion 
assessments. The Look-Up Method was proposed as a referenee method, and elearly has many 
desirable properties. The method is fast to eompute an relatively easy to implement if one has 
suffieient data. One of the major drawbaeks of this method is that it is ad-hoe, so it may be diffieult 
to justify theoretieally. Consequently, though this method had better predietion errors and other 
useful properties, the model was not able to be used in a deeision-making eontext in a meaningful 
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way. The Look-Up Method ean be reeommended as a quiek way to make relatively aeeurate 
predietions, but one should be eautious in using it to make deeisions without further development. 

The Beta Regression method was proposed to handle the bounded nature of the data. Ineluding a 
model for the exeess -10 values proved useful, though the model often is eonservative in predieting 
a -10 value. Thus, more exploration needs to be done to borrow strength aeross these two models, 
in the hopes of aehieving tighter predietion intervals and ultimately better deeision making. Though 
the model is inferior in some ways to the Look-Up Method, it ultimately performs better in a 
deeision-making eontext and therefore ean be operationally useful. Its theoretieal underpinning 
allows one to make probabilistie statements about future events, whieh have been validated through 
simulation. Furthermore, the theoretieal underpinning allows one to eonstruet meaningful predietion 
intervals of any size, resulting in an ability to make deeisions at various true positive and false alarm 
levels. This model would thus give operators the ability to foreeast aeeurately and with eonhdenee, 
espeeially knowing what the eharaeteristies of the alarm system are (shown through simulation), 
although it would need to be determined whether a elassified with sueh a large false-alarm rate 
would aetually be operationally useful. 

Our future work is foeused on exploring the findings from these simulations more in order to 
eonstruet a more powerful predietive model. This paper has shown that the quantile of the eurrent 
observation is a useful pieee of information, and we believe ineluding it as a eovariate or threshold 
meehanism eould lead to better results. We also eontinue to explore methods for longitudinal data, 
thus deploying more powerful ways of borrowing information aeross time and events. 
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